AITopics | large-scale transformer

large-scale transformer

ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers

Neural Information Processing Systems, Dec-24-2025, 23:56:59 GMT

Efficiently serving ever-larger trained natural language models has become exceptionally challenging in practice, even for powerful cloud servers, because of their prohibitive memory and computation requirements. In this work, we present an efficient and affordable post-training quantization approach for compressing large Transformer-based models, termed ZeroQuant.
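As a rough illustration of what post-training quantization means in general, the sketch below maps each row of an already-trained weight matrix to 8-bit integers with one scale factor per row. This is a generic symmetric scheme written for illustration only; it is not taken from the paper, and the function names (`quantize_row`, `dequantize_row`, `quantize_matrix`) are hypothetical.

```python
def quantize_row(row, num_bits=8):
    """Symmetric quantization of one row of floats to signed integers.

    Generic illustration only; not the specific scheme used by ZeroQuant.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(x) for x in row)
    # One scale per row; fall back to 1.0 for an all-zero row.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(x / scale))) for x in row]
    return q, scale


def dequantize_row(q, scale):
    """Map quantized integers back to approximate float values."""
    return [v * scale for v in q]


def quantize_matrix(weights, num_bits=8):
    """Quantize each row independently; returns a list of (ints, scale) pairs."""
    return [quantize_row(row, num_bits) for row in weights]


if __name__ == "__main__":
    weights = [[0.5, -1.0, 0.25], [0.02, 0.01, -0.03]]  # made-up values
    for q, scale in quantize_matrix(weights):
        print(q, scale, dequantize_row(q, scale))
```

No retraining is involved: the integers and scales are computed directly from the trained weights, which is what makes the procedure "post-training". Each stored value then needs 8 bits plus a shared scale per row, at the cost of a rounding error of at most half a scale step per weight.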

efficient and affordable post-training quantization, large-scale transformer, name change, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Amortized Planning with Large-Scale Transformers: A Case Study on Chess

Neural Information Processing Systems, May-27-2025, 05:58:49 GMT

This paper uses chess, a landmark planning problem in AI, to assess transformers' performance on a planning task where memorization is futile -- even at a large scale. To this end, we release ChessBench, a large-scale benchmark dataset of 10 million chess games with legal move and value annotations (15 billion data points) provided by Stockfish 16, the state-of-the-art chess engine. We train transformers with up to 270 million parameters on ChessBench via supervised learning and perform extensive ablations to assess the impact of dataset size, model size, architecture type, and different prediction targets (state-values, action-values, and behavioral cloning). Our largest models learn to predict action-values for novel boards quite accurately, implying highly non-trivial generalization. Despite performing no explicit search, our resulting chess policy solves challenging chess puzzles and achieves a surprisingly strong Lichess blitz Elo of 2895 against humans (grandmaster level).
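One way to picture a search-free policy built on predicted action-values is to score every legal move with the model and play the highest-scoring one. The sketch below assumes exactly that; `greedy_move` and `predict_action_value` are hypothetical names, and the toy values are invented stand-ins for a trained model, not results from the paper.

```python
def greedy_move(board, legal_moves, predict_action_value):
    """Pick the legal move with the highest predicted action-value.

    `predict_action_value(board, move)` is a hypothetical stand-in for a
    trained model; no lookahead or tree search is performed here.
    """
    if not legal_moves:
        raise ValueError("no legal moves available")
    return max(legal_moves, key=lambda move: predict_action_value(board, move))


if __name__ == "__main__":
    # Invented scores for three opening moves, for illustration only.
    toy_values = {"e2e4": 0.54, "d2d4": 0.55, "g1f3": 0.52}
    move = greedy_move("startpos", list(toy_values), lambda b, m: toy_values[m])
    print(move)
```

The planning cost is paid once, during training; at play time the policy needs only one model evaluation per legal move, which is the sense in which planning is amortized.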

amortized planning, case study, large-scale transformer, (1 more...)

Industry: Leisure & Entertainment > Games > Chess (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.84)

ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers

Neural Information Processing Systems, Jan-18-2025, 13:08:57 GMT

efficient and affordable post-training quantization, large-scale transformer, zeroquant, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.58)